Abstract

Algorithms based on the particle flow approach are becoming increasingly 
utilized in collider experiments due to their superior jet energy and missing 
energy resolution compared to the traditional calorimeter-based measure- 
ments. Such methods have been shown to work well in environments with 
low occupancy of particles per unit of calorimeter granularity. However, 
at higher instantaneous luminosity or in detectors with coarse calorimeter 
segmentation, the overlaps of calorimeter energy deposits from charged and 
neutral particles significantly complicate particle energy reconstruction, re- 
ducing the overall energy resolution of the method. We present a technique 
designed to resolve overlapping energy depositions of spatially close parti- 
cles using a statistically consistent probabilistic procedure. The technique is 
nearly free of ad-hoc corrections, improves energy resolution, and provides 
new important handles that can improve the sensitivity of physics analyses: 
the uncertainty of the jet energy on an event-by-event basis and the estimate of
the probability of a given particle hypothesis for a given detector response. 
Applied to the hadronic tau reconstruction using the CDF-II detector at 
Fermilab, the method has demonstrated reliable and robust performance. 



1. Introduction to the Particle Flow Algorithm 

Accurate measurement of energy of hadronic jets is critical for precision 
verification of the Standard Model (SM) as well as searches for new physics 
at current and future collider experiments. A standard jet energy measure- 
ment technique relies on clustering spatially close energy depositions in the 
calorimeter, the detector designed to measure the energy of particles that
produce electromagnetic or hadronic showers in the absorber material. Given
that on average about 70% of a typical jet energy is carried by particles
interacting hadronically$^{1}$ (mostly $\pi^{\pm}$, but also $K^{\pm}$, protons, neutrons), jet
energy measurement resolution is driven by the accuracy in reconstructing 
energy of hadronic showers. While the energy of electromagnetic showers 
can be measured very well, large fluctuations in the development of hadronic 
showers lead to a significantly lower precision.$^{2}$ The non-equal response of
the non-compensating calorimeters to electromagnetic and hadronic showers$^{3}$
further biases the overall jet energy scale and degrades resolution. Special
corrections accounting for the non-equal response can only partially recover this
reduction in resolution. While the presence of many particles in a jet av- 
erages out fluctuations in the measurement of energy of individual hadronic 
showers, jet energy resolution remains poor for jets of low (~10-30 GeV) and 
moderate (~30-60 GeV) energies. Incidentally, resolution of low-to-moderate 
energy jets has a strong impact on sensitivity of many physics analyses, from 
electroweak precision measurements to searches for Supersymmetry or Higgs 
in $b\bar{b}$ and $\tau\tau$ channels, motivating development of improved jet energy
measurement techniques. Furthermore, as mismeasurements of the jet energy
bias the calculation of the missing transverse energy ($\not\!\!E_T$) in an event, a
better jet energy resolution results in improved $\not\!\!E_T$ resolution, a key discriminant
in many searches for new physics. 

A significant improvement in the jet energy resolution at hadron collider
experiments has been achieved with the deployment of a technique known as
the Particle Flow Algorithm (PFA). PFA achieves better jet energy resolution
by reconstructing and measuring energies of individual particles in a jet using 
information from several detector sub-systems. For example, momentum of 
a charged hadron can be measured much more accurately using the tracking 
system (except for the case of very high transverse momenta, which is not 
relevant for this discussion), than in the calorimeter. This allows one to 
replace the less accurate calorimeter measurement of the energy carried by 



1 The remaining 30% is mainly due to neutral pions decaying to pairs of photons, which
produce electromagnetic showers.

2 A typical example is the CDF calorimeter, which has good electromagnetic calorimeter
resolution $\delta E/E \approx 0.135/\sqrt{E}$, while the response to stable hadrons, e.g. charged pions, is
substantially less precise, $\delta E/E \approx 0.5/\sqrt{E}$.

3 E.g., the main calorimeter systems at ATLAS, CDF, and CMS are all non-compensating.
charged hadrons in the PFA jet energy calculation with:

$$E_{jet} = \sum_{tracks} E_{trk} + \sum_{\gamma's} E_{\gamma} + \sum_{n's} E_{n}, \quad (1)$$

where the first term is the energy of the charged particles in the jet, the 
second term accounts for the energy of photons accurately measured in the
electromagnetic calorimeter, and $E_{n}$ is the energy of stable neutral hadrons,
e.g. neutrons or $K^0_L$'s, which still relies on the hadron calorimeter. The
corresponding relative jet energy resolution can be written in terms of single
particle resolutions as:

$$\frac{\sigma^2_{E_{jet}}}{E^2_{jet}} = \frac{1}{E^2_{jet}} \left( \sum_{tracks} \sigma^2_{E_{trk}} + \sum_{\gamma's} \sigma^2_{E_{\gamma}} + \sum_{n's} \sigma^2_{E_{n}} \right). \quad (2)$$

Note that only the last term depends on the potentially poor calorimeter 
resolution for energy of hadronic showers. However, because the
fraction of the jet energy carried by stable neutral hadrons is on average only
around 10%, its contribution to the overall jet energy uncertainty is strongly
suppressed by $\sum E_{n} / E_{jet}$. With the remaining 90% of energy accurately
measured either in the tracker or in the electromagnetic calorimeter, the PFA- 
based jet energy reconstruction can substantially outperform the traditional 
calorimeter-only based measurements. Furthermore, the bias in the energy 
scale related to calorimeter non-compensation effects is significantly reduced 
as it is only present in the suppressed third term and can be easily corrected. 
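
As a rough numerical illustration of Eq. (2), the sketch below compares a calorimeter-only measurement with an idealized PFA measurement for a single jet. The jet composition (60% charged hadrons, 30% photons, 10% neutral hadrons), the particle multiplicities, and the use of stochastic terms only are assumptions made for illustration. The function names are hypothetical and no confusion term is included.

```python
import math

# Illustrative single-particle resolutions (absolute sigma in GeV).
# Stochastic terms only; the coefficients follow the CDF numbers quoted in the text.
def sigma_em(e):
    """Photon measured in the electromagnetic calorimeter."""
    return 0.135 * math.sqrt(e)

def sigma_had(e):
    """Hadron measured in the hadron calorimeter."""
    return 0.5 * math.sqrt(e)

def sigma_track(p):
    """Charged hadron measured in the tracker, sigma(p)/p^2 = 0.0017."""
    return 0.0017 * p * p

def jet_sigma(charged, photons, neutrals, use_tracks):
    """Quadrature sum of per-particle uncertainties (the structure of Eq. 2)."""
    var = sum((sigma_track(e) if use_tracks else sigma_had(e)) ** 2 for e in charged)
    var += sum(sigma_em(e) ** 2 for e in photons)
    var += sum(sigma_had(e) ** 2 for e in neutrals)
    return math.sqrt(var)

# Assumed 40 GeV jet: 60% charged hadrons, 30% photons, 10% neutral hadrons.
charged = [8.0, 8.0, 8.0]
photons = [3.0, 3.0, 3.0, 3.0]
neutrals = [4.0]
e_jet = sum(charged) + sum(photons) + sum(neutrals)

calo_only = jet_sigma(charged, photons, neutrals, use_tracks=False) / e_jet
pfa_ideal = jet_sigma(charged, photons, neutrals, use_tracks=True) / e_jet
print(f"calorimeter-only: {calo_only:.3f}, ideal PFA: {pfa_ideal:.3f}")
```

In this toy the neutral-hadron term dominates the idealized PFA uncertainty, which is the behavior the text describes. A realistic estimate would also need the constant terms and the overlap effects discussed next.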

Apart from an obvious pre-requisite of highly efficient tracking, the per- 
formance of a PFA-based reconstruction in a realistic setting depends crit- 
ically on one's ability to correctly identify and separate calorimeter energy 
depositions from spatially close particles. One example illustrating the issue 
is an overlap of energy deposits in the calorimeter due to a charged pion 
and a neutron. In this case one has to "guess" the fraction of the measured 
calorimeter energy deposited by the charged pion, so that the excess can be 
attributed to a neutral hadron. The dependence of the jet energy resolution
on the overlap effects is sometimes parameterized by amending Eq. (2)
with the so-called "confusion term" [1], $\sigma^2_{conf}$. The relative importance of
the confusion term depends on the power of the algorithm and the detector 
design features, but it generally increases with the coarser calorimeter seg- 
mentation and higher particle densities. In extreme cases, the large size of 
the confusion term can completely eliminate the advantages of the PFA over
traditional calorimeter-based measurement. 

PF-based algorithms were successfully implemented at LEP in the 1990s
[2] and have been pursued in developing the physics program for the International
Linear Collider (ILC) [3]. At hadron collider experiments, a simplified version
of a PFA-based algorithm was implemented for reconstructing hadronically
decaying tau leptons at CDF at the end of Run I [4], providing a strong
improvement in hadronic tau jet energy resolution. It was further improved and
used at CDF for Run II analyses [5]. A more comprehensive implementation
of the same technique [6] has been shown to improve generic jet resolution at
CDF compared to calorimeter-only reconstruction. However, the confusion 
term associated with frequent energy overlaps owing to the coarse segmenta- 
tion of the CDF calorimeter towers has allowed only a limited improvement. 
A complete PFA algorithm developed by the CMS experiment [7] has allowed 
for a strong improvement in jet energy and missing transverse energy scale 
and resolution. The CMS detector is nearly ideally suited for PFA-based 
reconstruction due to the fine granularity of the electromagnetic calorimeter 
and the longitudinal profiling of hadronic showers, which improves their spa- 
tial resolution. However, the series of the "High Luminosity LHC" upgrades 
is expected to result in significant increases in particle occupancies per event.
Maintaining high performance of the PFA-based reconstruction in the new 
regime requires developing techniques capable of efficiently resolving energy 
overlaps. 

In this paper, we discuss the challenges and implications of deploying a 
PFA-based reconstruction in an environment with frequent energy overlaps 
(Section II). In Section III we present a technique designed to resolve the 
overlapping energy depositions of spatially close particles using a statisti- 
cally consistent probabilistic procedure. In addition to improving the energy 
resolution, the technique allows for combining measurements from multiple 
detectors, as opposed to "substituting" one measurement with another in 
existing algorithms. It is nearly free of ad-hoc corrections thus minimizing 
distortions due to the discontinuities of the correction functions. The algo- 
rithm provides additional handles, such as the measurement of jet energy 
uncertainty on a jet-by-jet basis and the measure of the overall consistency 
of the measurement, improving the sensitivity of physics analyses. In Section 
IV, we describe implementation of this technique for reconstructing hadronic 
tau jets at CDF and illustrate its performance with real data in Section V. 

2. Challenges of the High Occupancy Environment

Reconstruction of hadronically decaying tau jets using the CDF-II
detector is a good example of a problem with frequent overlaps of energy
deposits by nearby particles. The CDF calorimeter has projective tower
geometry with azimuthal segmentation $\Delta\phi = 15^{\circ}$ and pseudorapidity segmentation
$\Delta\eta \approx 0.1$ and provides very limited information about the lateral and
longitudinal shower profiles.$^{4}$ With a typical angular size of a hadronic tau jet
being of the order of 0.05-0.1 rad, there is a substantial probability for several 
or even all particles within the tau jet to cross the face of the calorimeter 
within the boundaries of a single calorimeter tower. Treatment of frequent 
energy overlaps is therefore a key consideration in designing a PFA-based 
reconstruction at CDF. 

To set the stage, we need to briefly describe the sub-detector systems
used in tau reconstruction and identification; a full description of the CDF
detector is available elsewhere [8]. The CDF tracking system provides nearly
100% efficient tracking within the pseudorapidity range of $|\eta| < 1$, which is
relevant to tau reconstruction. Its main element is the Central Outer Tracker
(COT), a drift chamber that covers radii from 0.4 m to 1.37 m, providing
momentum resolution of $\delta p_T/p_T^2 \approx 0.0017\ (\mathrm{GeV}/c)^{-1}$. If available, hits from
the silicon vertex detector (SVX) are added to the COT information, further
improving the resolution. Central electromagnetic (CEM) and hadronic
(CHA) calorimeters cover the pseudorapidity region of $|\eta| < 1.1$. CEM is
a lead-scintillator calorimeter with resolution $\delta E_T/E_T = 0.135/\sqrt{E_T} \oplus 0.02$.
CHA is an iron-scintillator calorimeter with the single pion energy resolution
of $0.5/\sqrt{E_T} \oplus 0.03$. Both calorimeters have a projective tower geometry with
tower size $\Delta\phi \times \Delta\eta \approx 15^{\circ} \times 0.1$ and neither of the calorimeters measures
either the longitudinal or lateral shower profile. The Shower Maximum (CES)
detector, consisting of a set of strip-wire chambers embedded inside the CEM 
at the expected maximum of the electromagnetic shower profile, enables mea- 
surement of the position of electromagnetic showers with accuracy of a few 
mm by reconstructing clusters formed by strip and wires. While rarely used 
to measure energy of the electromagnetic showers, CES cluster's pulse height 



4 As discussed further in the text, there is a strip-wire chamber embedded inside the
electromagnetic calorimeter at $\approx 6X_0$, which allows for rough measurements of the lateral
profile in some cases. Longitudinal profile information is limited to two energy
measurements for deposits in the electromagnetic and hadron compartments of a tower.
provides a measurement of electromagnetic shower energy with the resolution
of $\delta E/E = 0.23$ for showers due to energetic photons or electrons. Because
pulse heights of the one-dimensional strip and wire clusters reconstructed for
the same shower are typically within $\approx 7\%$ of each other,$^{5}$ multiple showers
within a single CES chamber can typically be correctly reconstructed
by matching the 1D strip and wire clusters using their pulse heights. The
much broader hadronic showers frequently extend over multiple CHA towers 
and their spatial position can only be inferred from the energy measured in 
each tower. Early hadronic showers can deposit part of their energy in CEM 
and produce signals in CES, which sometimes complicates reconstruction of 
CES clusters, e.g. if overlapping with showers produced by photons (from
$\pi^0 \to \gamma\gamma$) in the same jet.

Let us consider a relatively simple example of a jet containing a charged
pion $\pi^+$, a neutral pion $\pi^0$ decaying to two unresolved photons $\gamma_1\gamma_2$ (depositing
energy in a single tower), and possibly a neutral hadron $n$. While the $\pi^+$
momentum is known from the tracker, energy estimation for neutral particles
relies on the calorimeter measurement. However, the energy registered in the
electromagnetic and hadronic parts of the calorimeter, $E^{EM}_{meas}$ and $E^{HAD}_{meas}$, is
a sum of the unknown deposits by each of the particles in the jet, including
that by the charged pion:

$$E^{EM}_{meas} = E^{EM}_{\pi^+} + E^{EM}_{\gamma_1\gamma_2} + E^{EM}_{n}, \quad (3)$$
$$E^{HAD}_{meas} = E^{HAD}_{\pi^+} + \left(E^{HAD}_{\gamma_1\gamma_2}\right) + E^{HAD}_{n}, \quad (4)$$

resulting in an under-constrained system with two equations and six unknowns.
Because the leakage of the electromagnetic showers from photons
into the hadron calorimeter is typically small, as illustrated in Fig. 1(a) showing
$E^{EM}$ vs. $E^{HAD}$ for simulated electrons, the corresponding term ($E^{HAD}_{\gamma_1\gamma_2}$,
shown in parentheses in Eq. (4)) can be neglected. While it reduces the number
of unknowns, solving the system of Eqs. (3, 4) requires disentangling
contributions from hadronically interacting particles. While the $E^{EM}_{\pi^+}$ and $E^{HAD}_{\pi^+}$
terms are correlated with the accurately measured momentum of $\pi^+$, the
correlation is not trivial, as illustrated in Fig. 1(b) showing the 2D distribution
of $E^{EM}$ vs. $E^{HAD}$ for a simulated sample of charged pions with $p_{\pi^+} = 25$


5 The CES energy resolution is driven by the fluctuations in the amount of ionization
produced inside the CES chambers and not by the measurement of the charge collected
on strips and wires.
GeV/c. The complex shape of the dependence owes to the large fluctuations
in the development of hadronic showers and the non-compensating nature of
the CDF calorimeter. Because $E^{EM}_{\pi^+}$ cannot be reliably estimated, and $E^{EM}_{n}$
is completely unconstrained, the momentum of the $\pi^0$ cannot be calculated
directly. Estimating the jet energy directly in the PFA approach is therefore
hampered by two issues: (i) difficulty in estimating $E^{EM}$ for hadronically
interacting particles required to evaluate the $\pi^0$ momentum, and (ii) difficulty
in estimating $E^{HAD}$ required to estimate the momentum of $n$. Measuring the
momentum of a combined $\pi^0 + n$ system, e.g. by "guessing" the charged pion
energy depositions and assigning the rest to the $\pi^0 + n$ system, is nearly
exactly equivalent to measuring jet energy using the calorimeter only, thus
negating all advantages of the PFA technique.
Figure 1: Examples of the calorimeter response for (a) simulated isolated electrons with
$p = 25$ GeV/c and (b) simulated isolated charged pions with $p = 25$ GeV/c in the plane
$E^{EM}$ versus $E^{HAD}$. The size of the boxes in the plots is proportional to the probability
density; the shaded area indicates the area of the highest density as obtained from the
same distribution plotted with finer bin size.

An algorithm based on solving Eqs. (3, 4) directly with the specific purpose
of reconstructing hadronic tau jets was implemented in the "tracks+$\pi^0$'s"
algorithm at CDF and used in the early Run-II analyses. The idea was to
simplify the problem by assuming the absence of neutral hadrons and to
estimate $E^{EM}_{\pi^+}$ as an average energy deposition in the EM calorimeter for a
charged pion with given momentum (measured in the tracker). Then the
remaining portion of the measured electromagnetic energy can be taken as
the energy of the $\pi^0$ (Eq. (3)). Alternatively, one can assume that charged
pions always behave in the electromagnetic calorimeter as minimum-ionizing
particles. While delivering a significant improvement over the calorimeter- 
only measurement for a large fraction of events, the algorithm featured long 
tails in the energy resolution. These tails have been traced to jets with sev- 
eral particles depositing energy in the same calorimeter tower. In physics 
analyses, undermeasuring energy of quark or gluon jets containing neutral 
hadronically interacting particles also leads to an increase in background 
contamination. Additional corrections based on detecting incompatibilities 
of the reconstructed energy with the initially unused Eq. (4) or gross
disagreements with the low resolution measurement of $\pi^0$ energy in the Shower
Maximum detector allow for reduction of the tails in the energy resolution. 
However, the ad-hoc nature and complexity of the corrections, as well as the 
algorithm's inability to consistently treat correlations and incorporate other 
available measurements motivate developing a more comprehensive method. 
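
The subtraction at the heart of the "tracks+$\pi^0$'s" approach can be sketched in a few lines. This is an illustrative reading of the description above, not the CDF code: the function names, the assumed average EM response of 30% of the pion momentum, and the clipping of negative results at zero are all hypothetical choices.

```python
def pi0_energy_estimate(e_em_meas, track_momenta, mean_em_deposit):
    """Estimate the pi0 energy as the EM energy left after subtracting the
    average EM deposit expected from each charged pion (no neutral hadrons).
    Clipping at zero is an assumption of this sketch."""
    expected_charged = sum(mean_em_deposit(p) for p in track_momenta)
    return max(e_em_meas - expected_charged, 0.0)

def mean_em_deposit(p):
    """Hypothetical average EM response: 30% of the pion momentum.
    A real parameterization would come from simulation or test beam data."""
    return 0.3 * p

# One 25 GeV/c charged pion overlapping with a 10 GeV pi0 in the same tower.
true_pi0 = 10.0
estimates = []
for pion_em_deposit in (0.5, 7.5, 15.0):   # late, average, and early pion shower
    e_em_meas = pion_em_deposit + true_pi0
    estimate = pi0_energy_estimate(e_em_meas, [25.0], mean_em_deposit)
    estimates.append(estimate)
    print(f"pion EM deposit {pion_em_deposit:5.1f} GeV -> pi0 estimate {estimate:5.1f} GeV")
```

The estimate is right only when the pion deposits its average energy. Any fluctuation of the pion shower is transferred in full to the $\pi^0$ energy, which is one way to see where the tails in the resolution come from.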

3. PPFA: The Probabilistic Particle Flow Algorithm 

The challenge of solving an underconstrained system with significant
correlations and additional redundant measurements outlined in the previous
section can be addressed with a probabilistic approach. For every hypothesis
of the jet particle content (the number of particles of each type), one can 
define a probability estimator (likelihood) for a set of particles of given type 
and momenta to result in a particular set of detector measurements. These 
measurements could represent energy counts in calorimeter towers, cluster 
energies, track momenta or any other available measurement. The likelihood 
can be written as follows: 

$$\mathcal{L}(\vec{p}\,|\,\vec{E}_{meas}) = \int \prod_{j=det} \mathcal{M}(E^{1}_{j}, \ldots, E^{i_p}_{j}, E^{1}_{meas}, \ldots, E^{j_m}_{meas}) \times \mathcal{P}_{ij}(E^{i}_{j}\,|\,p_i)\, dE^{i}_{j}, \quad (5)$$

where index $i$ runs over particles in a jet ($i = 1, \ldots, i_p$), $p_i$ is the true momentum
of particle $i$, $E^{j}_{meas}$ stands for each available measurement ($j = 1, \ldots, j_m$
runs over all available measurements), $\mathcal{P}_{ij}(E^{i}_{j}|p_i)$ is the "response function"
for particle $i$ with true momentum $p_i$ to produce a contribution $E^{i}_{j}$ to a
measurement $j$, and the matrix $\mathcal{M}$ contains information about correlations
between contributions of each particle to each measurement. One example
of the latter is the correlation between the deposits of energy $E^{i}_{j}$ in an
electromagnetic calorimeter tower $j$ by all particles crossing it, giving a term
$\delta(\sum_{i} E^{i}_{j} - E^{j}_{meas})$ in $\mathcal{M}$. Another example is the correlation between the
energy deposits by particle $i$ in the electromagnetic and hadron calorimeter
clusters (or towers) $j_1$ and $j_2$ it crosses, in which case the corresponding
term may have a fairly complex form $f(E^{i}_{j_1}, E^{i}_{j_2})$. If such a global likelihood
function were constructed, $\vec{p}$ corresponding to its maximum would determine
the most probable set of particle momenta, thus achieving the goal of fully 
reconstructing the event using all available detector information. The type 
of each particle and their number can be taken as parameters of the global 
likelihood allowing one to also determine the most probable particle content 
of a jet. 

While building a global and fully inclusive likelihood is certainly possi- 
ble, it is hardly practical. However, this approach can be deployed to solve 
specific problems like measuring jet energy in environments with frequent 
energy overlaps in the calorimeter. Here, we will describe an example of one
such possible PPFA implementation. For simplicity, this example will use the
energy of pre-reconstructed electromagnetic and hadronic calorimeter clusters
as the basic measurements $E^{j}_{meas}$, but an implementation using tower
energy measurements would be very similar. The PPFA probability for a set
of particles with momenta $p_i$ to produce a set of calorimeter measurements
$E^{j}_{meas}$ for each cluster $j$ in the electromagnetic and hadron calorimeters can be
written as follows:

$$\mathcal{L}_p(\vec{p}\,|\,\vec{E}_{meas}) = \int \prod_{j=clusters} \delta\Big(\sum_{i} E^{i}_{j} - E^{j}_{meas}\Big)\, \mathcal{M}\, \mathcal{P}_{ij}(E^{i}_{j}\,|\,p_i)\, dE^{i}_{j}, \quad (6)$$

where $\vec{p}$ is the vector of particle momenta $p_i$, index $i$ runs over the list of
particles in a jet, index $j$ runs over the available measurements (in our
example, the electromagnetic and hadronic calorimeter cluster energy
measurements), $E^{i}_{j}$ is the energy deposited by the $i$-th particle in cluster $j$, $E^{j}_{meas}$ is
the measured energy for cluster $j$, the response function $\mathcal{P}_{ij}(E^{i}_{j}|p_i)$ is
the probability for particle $i$ with true momentum $p_i$ to deposit energy $E^{i}_{j}$
in cluster $j$ ($\mathcal{P}_{ij}$ depends on the type of particle), and $\mathcal{M}$ is a yet undefined
correlation matrix. The likelihood $\mathcal{L}_p$ is essentially a sum of probabilities of all
possible outcomes (specific values of energy deposited by each particle in the
electromagnetic and hadronic calorimeter clusters) consistent with the actual
cluster energy measurements. The probability of each outcome is a product
of probabilities for each particle to deposit a given amount of energy $E^{i}_{j}$
in the hadronic and electromagnetic calorimeters, given their assumed true
momenta $p_i$. In this example, $\mathcal{M}$ in Eq. (6) is needed to account for the
correlation of deposits by the same particle in the electromagnetic and hadronic
calorimeters, e.g. early showering of a charged hadron leads to larger
deposition in the electromagnetic calorimeter and reduced energy deposit in the
hadron calorimeter. The easiest way to take this into account is to switch to
two-dimensional response functions $\mathcal{P}^{CAL}_{i}(E^{i}_{j_1}, E^{i}_{j_2}|p_i)$, where $j_1$ and
$j_2$ are the indices of the electromagnetic and hadronic calorimeter clusters
the particle traverses. In this scheme, the distributions previously shown in
Fig. 1(a) and (b) can be normalized and used as $\mathcal{P}^{CAL}(E^{EM}, E^{HAD}|p)$ for
electrons and charged pions, respectively.
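
To make the structure of Eq. (6) concrete, the sketch below evaluates it for the simplest overlap: one charged pion and one photon depositing energy in a single electromagnetic cluster. The delta function removes one of the two integrals, leaving a convolution of the two response functions. The Gaussian response shapes and their parameters are assumptions made for illustration (the real responses are the non-Gaussian distributions of Fig. 1), and all function names are hypothetical.

```python
import math

def gauss(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical 1D response functions P(E | p) for deposits in one EM cluster.
def response_photon(e, p):
    return gauss(e, p, 0.135 * math.sqrt(p))

def response_pion_em(e, p):
    # Assumed: 30% of the pion momentum on average, with a 10% spread.
    return gauss(e, 0.3 * p, 0.1 * p)

def likelihood(p_pion, p_photon, e_meas, n_steps=2000, e_max=100.0):
    """Eq. (6) for one cluster and two particles: integrate over the pion
    deposit e_pi; the photon deposit is fixed to e_meas - e_pi by the delta."""
    de = e_max / n_steps
    total = 0.0
    for k in range(n_steps):
        e_pi = (k + 0.5) * de  # midpoint rule
        total += response_pion_em(e_pi, p_pion) * response_photon(e_meas - e_pi, p_photon)
    return total * de

def best_photon_momentum(p_pion, e_meas, grid):
    """Scan the photon momentum; the pion momentum is fixed by the tracker."""
    return max(grid, key=lambda p: likelihood(p_pion, p, e_meas))

grid = [0.5 * k for k in range(2, 61)]  # 1.0 ... 30.0 GeV/c
best = best_photon_momentum(25.0, 17.5, grid)
print(f"most likely photon momentum: {best} GeV/c")
```

A scan over a grid is used here only because it is easy to read. With several particles and two-dimensional response functions the same product-and-integrate structure applies, but the integral no longer reduces to a single convolution.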



[Figure 2: two panels, "Simulated ShowerMax response for isolated electrons" with 5 < p < 6 GeV/c (left) and 25 < p < 26 GeV/c (right); horizontal axis: cluster pulse height (arbitrary units).]

Figure 2: Examples of the Shower Maximum detector response functions for simulated
isolated electrons with momenta in the ranges $p = 5-6$ GeV/c (left) and $p = 25-26$
GeV/c (right). Similarity in the response between electrons and photons allows using
these functions in constructing likelihood functions for either electrons or photons.

Additional measurements can be easily incorporated by modifying the 
likelihood function with Bayesian-like "priors". For example, information 
from tracking or Shower Maximum detectors can be added by multiplying 
the initial likelihood function by a probability to measure a certain track 
momentum or pulse height given the assumed true momentum of a charged
pion, electron, or photon. For example, the distributions shown in Fig. 2, upon
normalization, can be used as the response functions of the Shower Maximum
detector $\mathcal{P}^{CES}_{\gamma}(E^{CES}|p_{\gamma})$ for photons with momenta in the ranges $p = 5-6$ and
$25-26$ GeV/c.

The most probable set of particle momenta $\vec{p}^{\,0}$ is obtained by maximizing
the likelihood $\mathcal{L}_p(\vec{p}\,|\,\vec{E}_{meas})$. The likelihood shape in the $\vec{p}$ space can be used
to evaluate the uncertainty in the energy of each particle. If one primarily
seeks to measure the energy of the entire jet, one can use the likelihood to
obtain a "posterior" distribution for the jet energy, defined as a sum of the
energies of the constituent particles,

$$\mathcal{L}_E(E_{jet}\,|\,\vec{E}_{meas}) = \int \mathcal{L}_p(\vec{p}\,|\,\vec{E}_{meas}) \times \delta\Big(\sum_{i=1}^{i_p} p_i - E_{jet}\Big)\, d\vec{p}, \quad (7)$$

as in the presence of correlations the latter provides a more convenient
estimate of the jet energy and its uncertainty. The shape of the jet energy
"posterior" allows estimating the uncertainty in the measured jet energy.

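A discrete version of Eq. (7) amounts to histogramming the likelihood in bins of the summed momentum. The sketch below does this for two particles with a toy likelihood in which the sum of the momenta is well constrained while the split between them is not. The toy likelihood, the grid, and the function names are assumptions made for illustration.

```python
import math
from collections import defaultdict

def toy_likelihood(p1, p2):
    """Hypothetical L_p: the sum p1 + p2 is well constrained (sigma 1 GeV),
    the split between the two particles is not (sigma 5 GeV)."""
    return math.exp(-0.5 * (p1 + p2 - 30.0) ** 2) * math.exp(-0.5 * ((p1 - p2) / 5.0) ** 2)

def jet_energy_posterior(likelihood, grid, bin_width=0.25):
    """Discrete Eq. (7): accumulate L_p in bins of E_jet = p1 + p2.
    bin_width matches the grid step so every sum falls on a bin center."""
    hist = defaultdict(float)
    for p1 in grid:
        for p2 in grid:
            e_bin = round((p1 + p2) / bin_width) * bin_width
            hist[e_bin] += likelihood(p1, p2)
    norm = sum(hist.values())
    return {e: w / norm for e, w in sorted(hist.items())}

def mean_and_sigma(dist):
    mean = sum(e * w for e, w in dist.items())
    var = sum((e - mean) ** 2 * w for e, w in dist.items())
    return mean, math.sqrt(var)

grid = [0.25 * k for k in range(0, 241)]  # 0 ... 60 GeV/c
posterior = jet_energy_posterior(toy_likelihood, grid)
e_jet, sigma_jet = mean_and_sigma(posterior)
print(f"E_jet = {e_jet:.2f} +- {sigma_jet:.2f} GeV")
```

In this toy the jet energy is known to about 1 GeV even though each momentum taken alone is uncertain by roughly 2.5 GeV, which illustrates why the jet energy "posterior" is the more convenient quantity when the individual momenta are strongly anti-correlated.
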
Once the most likely set of particle momenta $\vec{p}^{\,0}$ is found, one can further
test the "goodness" of the particle hypothesis. We define a p-value as the
probability to observe a combination of detector measurements $\vec{E}'_{meas}$ that is
equally or less likely than the actual set $\vec{E}_{meas}$ observed in the event, given
that the true combination of particles and momenta is the one that maximizes
the likelihood in Eq. (6):

$$p = \frac{\int_{\mathcal{L}_p(\vec{p}^{\,0}|\vec{E}'_{meas}) \le \mathcal{L}_p(\vec{p}^{\,0}|\vec{E}_{meas})} \mathcal{L}_p(\vec{p}^{\,0}\,|\,\vec{E}'_{meas})\, d\vec{E}'_{meas}}{\int \mathcal{L}_p(\vec{p}^{\,0}\,|\,\vec{E}'_{meas})\, d\vec{E}'_{meas}}. \quad (8)$$

In practice, the p-value can be easily calculated by generating "pseudo-experiments,"
in which one generates "pseudo-deposits" of energy by each particle
with momenta $\vec{p}^{\,0}$ towards each cluster energy measurement using the same
response functions. The sum of the deposits of all particles crossing particular
clusters yields a set of pseudo-measurements $\vec{E}'_{meas}$. The probability of the
generated outcome is given by $\mathcal{L}_p$, and the integrated probability of observing
an equally or less probable set of measurements gives the p-value. A p-value
that is too low may indicate that the initial hypothesis should be modified. Note that
interpreting measured p-values has to be done carefully, as arbitrary addition
of new particles to make the observed calorimeter response "perfect" may
degrade the resolution by biasing the measurement towards the
calorimeter-based jet energy measurement.
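
The pseudo-experiment procedure can be sketched as follows for a single cluster crossed by one charged pion and one photon. The Gaussian response functions (for which the convolution in Eq. (6) is again a Gaussian), the number of pseudo-experiments, and the function names are assumptions of this sketch, not the parameters used in the analysis.

```python
import math
import random

def gauss_pdf(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical Gaussian responses: (mean, sigma) of the deposit in one
# cluster for a particle with fitted momentum p.
def pion_response(p):
    return 0.3 * p, 0.1 * p

def photon_response(p):
    return p, 0.135 * math.sqrt(p)

def cluster_likelihood(e_meas, responses):
    """For Gaussian responses the convolution in Eq. (6) is a Gaussian."""
    mean = sum(m for m, _ in responses)
    sigma = math.sqrt(sum(s * s for _, s in responses))
    return gauss_pdf(e_meas, mean, sigma)

def p_value(e_obs, responses, n_toys=20000, seed=1):
    """Fraction of pseudo-experiments that are equally or less likely
    than the observed cluster energy."""
    rng = random.Random(seed)
    l_obs = cluster_likelihood(e_obs, responses)
    n_less_likely = 0
    for _ in range(n_toys):
        # Pseudo-deposit of every particle, summed into a pseudo-measurement.
        e_toy = sum(rng.gauss(m, s) for m, s in responses)
        if cluster_likelihood(e_toy, responses) <= l_obs:
            n_less_likely += 1
    return n_less_likely / n_toys

responses = [pion_response(25.0), photon_response(10.0)]
p_central = p_value(17.5, responses)  # observation at the most probable value
p_tail = p_value(26.0, responses)     # observation far in the tail
print(p_central, p_tail)
```

An observation at the most probable value gives a p-value of one, while an observation more than three standard deviations away gives a very small p-value, which in the full algorithm would prompt a change of the particle hypothesis.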

4. PPFA Implementation for Hadronic Tau Reconstruction at CDF 

In this section we describe a practical implementation of the method de- 
veloped for hadronic tau jet reconstruction at CDF. In the following, we 
discuss the CDF baseline hadronic tau jet reconstruction, which is used as a
starting point for the algorithm. We then discuss the PPFA strategy,
measurement of the response functions, mathematical definition of the PPFA
likelihood function and the "p-value," and the algorithm used for correcting
the initial particle hypothesis. We conclude with evaluating the algorithm's
energy resolution using simulation.

4.1. Baseline Hadronic Tau Jet Reconstruction at CDF

A hadronic tau jet candidate is defined as a narrow cluster of calorimeter
energy with a seed tower of $E_T > 5$ GeV/c and at least one track with
$p_T > 5$ GeV/c pointing to the cluster. The narrow cluster is defined as a
cluster with no more than six contiguous calorimeter towers with $E_T > 1$
GeV/c and is required to be in the central part of the detector ($|\eta| < 1$)
to ensure high tracking efficiency. Given the size of the CDF calorimeter
towers of $\Delta\eta \times \Delta\phi \approx 0.1 \times 0.25$, the efficiency of the calorimeter-related
selection is very high for hadronically decaying taus with visible $p_T > 10$
GeV/c. The seed track $p_T$ requirement brings a non-negligible inefficiency
for tau jets of low-to-moderate visible momentum, but its strong power in
rejecting quark and gluon jet backgrounds made it a standard in all CDF
analyses involving hadronic tau jets. Next, all tracks within a signal cone of
$\Delta R = \sqrt{\Delta\phi^2 + \Delta\eta^2} < 0.17$ around the seed track are associated with the
tau candidate.
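
The baseline selection described above can be sketched as a short filter. The thresholds are those quoted in the text. The `Track` container, the function names, the choice of the highest-$p_T$ track as the seed, and the assumption that the caller supplies tracks already matched to the cluster are hypothetical details of this sketch.

```python
import math
from dataclasses import dataclass

@dataclass
class Track:
    pt: float   # transverse momentum, GeV/c
    eta: float  # pseudorapidity
    phi: float  # azimuthal angle, radians

def delta_r(a, b):
    """Angular distance with the azimuthal difference wrapped to [-pi, pi]."""
    dphi = math.remainder(a.phi - b.phi, 2.0 * math.pi)
    return math.hypot(a.eta - b.eta, dphi)

def tau_signal_tracks(seed_tower_et, n_towers, cluster_eta, matched_tracks, cone=0.17):
    """Apply the baseline cuts; return the signal-cone tracks, or None if the
    candidate fails. matched_tracks are assumed to point to the cluster."""
    if seed_tower_et <= 5.0 or n_towers > 6 or abs(cluster_eta) >= 1.0:
        return None
    seeds = [t for t in matched_tracks if t.pt > 5.0]
    if not seeds:
        return None
    seed = max(seeds, key=lambda t: t.pt)  # assumption: highest-pT track is the seed
    return [t for t in matched_tracks if delta_r(t, seed) < cone]

tracks = [
    Track(pt=10.0, eta=0.20, phi=3.10),   # seed track
    Track(pt=2.0, eta=0.25, phi=-3.10),   # close to the seed across the phi wrap
    Track(pt=3.0, eta=0.60, phi=3.10),    # outside the signal cone
]
selected = tau_signal_tracks(8.0, 3, 0.2, tracks)
print(len(selected))
```

The second track illustrates why the azimuthal difference has to be wrapped: its raw difference from the seed is close to $2\pi$, yet it lies well inside the signal cone.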

4.2. Implementation Strategy

The likelihood-based PFA algorithm starts with the initial hypothesis 
that every reconstructed track is a charged pion, every reconstructed cluster 
in the Shower Maximum detector with no track pointing to it is a photon, 
and no other particles are present in the jet. While this initial hypothesis can 
be corrected at a later point in the algorithm, in most cases it turns out to be 
true owing to the low rate of the track and Shower Maximum reconstruction 
failures and the low branching fraction of hadronic tau lepton decays for 
modes with neutral hadrons except $\pi^0$'s, e.g. $\tau \to K^0_L + X$. Next, we define
the probability function using pre-calculated response functions (details for 
both are discussed in the following two sub-sections) and perform a scan in 
the multi-dimensional parameter space of momenta of the particles, assumed 
to comprise the hadronic tau jet, searching for the maximum of the likelihood 
function. 
After the most likely combination of particle momenta is determined, we
construct the "p-value", which measures the probability that the given particle
content and momenta hypothesis results in detector measurements equally or
less likely than the observed response. If the p-value is too low, the
particle content hypothesis is modified by adding a photon, which is assumed
to have escaped reconstruction either due to detector inefficiency or an overlap
with a track (a Shower Maximum cluster is vetoed if it is reconstructed too
close to the extrapolated position of a charged track), and the full calculation
is repeated. If the p-value remains too low, the particle content is modified
by adding a stable neutral hadron ($K^0_L$) and the likelihood calculation is
repeated. The procedure iterates until an acceptable outcome is achieved or
the pre-set options are exhausted.
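
The control flow of this strategy can be sketched as a small loop. The text does not quote a p-value threshold, so `p_min` below is a placeholder, and `fit` and `p_value` stand for the likelihood scan and the pseudo-experiment calculation. All names are hypothetical, and adding the neutral hadron on top of the extra photon, rather than instead of it, is one possible reading of the description.

```python
def reconstruct(initial_particles, fit, p_value, p_min=0.01,
                extra_particles=("photon", "K0L")):
    """Fit the hypothesis, test it, and add one more particle per iteration
    until the p-value is acceptable or the pre-set options run out."""
    hypothesis = list(initial_particles)
    options = list(extra_particles)
    while True:
        momenta = fit(hypothesis)             # maximize the likelihood
        pval = p_value(hypothesis, momenta)   # goodness of the hypothesis
        if pval >= p_min or not options:
            return hypothesis, momenta, pval
        hypothesis.append(options.pop(0))

# Toy stand-ins: the "event" is only well described once a photon is included.
def toy_fit(hypothesis):
    return [(name, 10.0) for name in hypothesis]

def toy_p_value(hypothesis, momenta):
    return 0.5 if "photon" in hypothesis else 0.001

final_hypothesis, final_momenta, final_p = reconstruct(["pi+"], toy_fit, toy_p_value)
print(final_hypothesis, final_p)
```

Keeping the list of allowed additions short and fixed reflects the caution expressed earlier: freely adding particles until the calorimeter response looks "perfect" would pull the result back towards a calorimeter-only measurement.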

4.3. Response Functions of the CDF Detector Sub-systems

As discussed earlier, the relevant detector measurements include track- 
ing, measurements of energy deposited in the electromagnetic and hadronic 
calorimeter towers, and the measured CES cluster energy. Because the precision
of the CDF tracking is much higher than the accuracy of other measurements,
the tracker response function for charged pions as a function of pion
momenta can be safely approximated by a delta function to simplify further
calculations. To determine the calorimeter response functions for charged
pions, we use the CDF GEANT-3 [9] based simulation package tuned using the
test beam data. Isolated charged pions are selected using hadronic tau decays
$\tau^{\pm} \to \pi^{\pm}\nu_{\tau}$ from an inclusive $Z/\gamma^* \to \tau\tau$ simulated sample of events generated
with Pythia [10]. We calculate response functions for charged pions with
momenta ranging from 1 to 100 GeV/c in steps of 1 GeV/c. Large fluctu- 
ations in the development of hadronic showers and their large lateral size, 
frequently spanning across several CHA towers, make it impractical to cal- 
culate responses separately for each tower in a multi-tower cluster. Instead, 
we measure the hadronic calorimeter response for charged pions by summing 
tower energies in a square of 3 x 3 towers centered on the extrapolated posi- 
tion of the 7r track. In the CEM, hadronic showers rarely deposit energy in 
more than a single tower, therefore charged pion electromagnetic deposition 
is calculated using the energy in the tower pointed at by the track associated 
with 7r =l= . To take into account the strong correlation of the energy depositions 
by the same particle in CEM and CHA, we define a 2-dimensional response 
function in the E EM versus E HAD plane. Fig. []Jb) shows an example of the 
calorimeter responses in CEM and CHA for simulated isolated charged pions 
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with momenta 25 < p n < 26 GeV/c. We verify the accuracy of response 
functions obtained using simulation by comparing them with those obtained 
in a pure sample of isolated charged pions in data. When normalized to 
unity, these response functions represent the probability density functions 
(PDF) for a charged pion with a particular momentum to produce a given 
response in the calorimeter, which we will refer to as 'P^al^em ^ E HAD \p n ). 
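
As an illustration of how such a response function can be tabulated, the
sketch below fills a two-dimensional histogram in the ($E^{EM}$, $E^{HAD}$)
plane for a single momentum bin and normalizes it to unit integral. The
binning, the energy range, the function names, and the toy shower model used
to generate the sample points are assumptions made for this example only;
they are not taken from the CDF simulation.

```python
import random


def build_response_pdf(samples, n_bins=20, e_max=40.0):
    """Histogram (E_EM, E_HAD) pairs and normalize to unit integral."""
    width = e_max / n_bins
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    n_in_range = 0
    for e_em, e_had in samples:
        if 0.0 <= e_em < e_max and 0.0 <= e_had < e_max:
            counts[int(e_em / width)][int(e_had / width)] += 1.0
            n_in_range += 1
    if n_in_range == 0:
        raise ValueError("no samples inside the histogram range")
    # Dividing by (entries * bin area) turns counts into a density.
    norm = n_in_range * width * width
    return [[c / norm for c in row] for row in counts], width


def pdf_value(pdf, width, e_em, e_had):
    """Look up the density; zero outside the tabulated range."""
    if e_em < 0.0 or e_had < 0.0:
        return 0.0
    i, j = int(e_em / width), int(e_had / width)
    if i >= len(pdf) or j >= len(pdf):
        return 0.0
    return pdf[i][j]


def toy_pion_deposits(p, n, rng):
    """Hypothetical shower model, used only to produce sample points."""
    deposits = []
    for _ in range(n):
        total = max(0.0, rng.gauss(0.8 * p, 0.15 * p))
        f_em = rng.random() ** 2  # most showers start late in this toy
        deposits.append((f_em * total, (1.0 - f_em) * total))
    return deposits


rng = random.Random(1)
pdf, width = build_response_pdf(toy_pion_deposits(25.0, 5000, rng))
```

In a realistic setting one such table would be built per momentum bin (1 to
100 GeV/c in steps of 1 GeV/c in the text) and per particle type.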

The vast majority of photons in tau jets originate from
$\pi^{0} \to \gamma\gamma$ and typically have energies of the order of a few
GeV, making an accurate understanding of the calorimeter response for low
energy photons particularly important. While the response functions for
photons can be measured directly from the simulation, validating them with
the data can be difficult owing to the challenges in selecting a high purity
sample of low energy photons in data. Fortunately, the calorimeter response
to photons and electrons is nearly identical, allowing for the use of a
relatively high purity sample of electrons in data obtained by tagging photon
conversions. Similar to the case of charged pions, we calculate 2-dimensional
response functions for photons with momenta ranging from 1 to 100 GeV/c in
steps of 1 GeV/c in the $E^{EM}$ versus $E^{HAD}$ plane. Fig. 1(a) shows an
example of the calorimeter response function for photons with true momenta
$25 < p_{\gamma} < 26$ GeV/c. We denote the response functions of this type
as $\mathcal{P}^{CAL}_{\gamma}(E^{EM}, E^{HAD} \mid p_{\gamma})$.

As mentioned earlier, the CES energy measurement is used in the likelihood
function because, despite its modest resolution, it can help assign energies
correctly in difficult cases. Because photon candidates reconstructed in the
CES have highly correlated strip and wire pulse heights, we only use the
strip based measurements to determine the energy of a given CES cluster.
Examples of the CES response functions
$\mathcal{P}^{CES}_{\gamma}(E^{CES} \mid p_{\gamma})$ for isolated photons
with energies $5 < p_{\gamma} < 6$ GeV and $25 < p_{\gamma} < 26$ GeV are
shown in Figs. 2(a) and (b), respectively.

4.4. Computation of the PPFA Likelihood

As mentioned earlier, the initial particle hypothesis assumes that each
reconstructed track is a charged pion and each reconstructed CES cluster not
associated with a track is a photon (or perhaps two merged photons, which
makes little difference). The tracking momentum measurement is taken to be
exact due to the superior resolution of the CDF tracker. To include
calorimeter measurements, the highest-$p_T$ track associated with a tau
candidate is extrapolated to the CES radius and the corresponding calorimeter
tower becomes a seed tower. A grid of 3x3 towers is formed around the seed
tower. Each track and CES cluster is associated with one tower on the grid.
Each electromagnetic tower provides its own measurement; together they form
the vector $\vec{E}^{EM}_{meas}$ (the components of this vector will be
denoted as $E^{EM_j}_{meas}$, $j = 1, \ldots, 9$) used in the likelihood. For
the hadronic calorimeter, we sum the energies of all nine towers into a
single measurement, $E^{HAD}_{meas} = \sum_{j} E^{HAD_j}_{meas}$, for the
entire "super-cluster". Under the assumption that the decay products of a tau
jet are charged tracks and photons, the likelihood function has the following
form:

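$$
\begin{aligned}
\mathcal{L}_p(\vec{p}_{\pi}, \vec{p}_{\gamma} \mid \vec{E}^{EM}_{meas}, E^{HAD}_{meas}, \vec{E}^{CES}_{meas})
 = \int & \prod_{i=1}^{N_{\gamma}} dE^{HAD}_{\gamma_i} \prod_{k=1}^{N_{\pi}} dE^{HAD}_{\pi_k}
   \prod_{j=1}^{9} \Big[ \prod_{i=1}^{N_{\gamma}} dE^{EM_j}_{\gamma_i} \prod_{k=1}^{N_{\pi}} dE^{EM_j}_{\pi_k} \Big] \\
 \times\; & \delta\Big( \sum_{i=1}^{N_{\gamma}} E^{HAD}_{\gamma_i} + \sum_{k=1}^{N_{\pi}} E^{HAD}_{\pi_k} - E^{HAD}_{meas} \Big)
   \prod_{j=1}^{9} \delta\Big( \sum_{i=1}^{N_{\gamma}} E^{EM_j}_{\gamma_i} + \sum_{k=1}^{N_{\pi}} E^{EM_j}_{\pi_k} - E^{EM_j}_{meas} \Big) \\
 \times\; & \prod_{i=1}^{N_{\gamma}} \mathcal{P}^{CAL}_{\gamma}(E^{EM_j}_{\gamma_i}, E^{HAD}_{\gamma_i} \mid p_{\gamma_i})\,
   \mathcal{P}^{CES}_{\gamma}(E^{CES}_{meas,i} \mid p_{\gamma_i})
   \prod_{k=1}^{N_{\pi}} \mathcal{P}^{CAL}_{\pi}(E^{EM_j}_{\pi_k}, E^{HAD}_{\pi_k} \mid p_{\pi_k})
\end{aligned}
\tag{9}
$$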

where the integration runs over all possible depositions of energy by each
individual particle in each available calorimeter measurement, the delta
functions in the second line ensure that the sum of the deposits for each
measurement is equal to the observed value, and the third line includes the
response functions for photons and charged pions in the calorimeter and in
the CES detector. One can choose to convert Eq. (9) into a posterior
probability distribution to estimate the hadronic tau jet energy as:

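$$
\mathcal{L}_E(E_{jet} \mid \vec{E}^{EM}_{meas}, E^{HAD}_{meas}, \vec{E}^{CES}_{meas})
 = \int \mathcal{L}_p(\vec{p}_{\pi}, \vec{p}_{\gamma} \mid \vec{E}^{EM}_{meas}, E^{HAD}_{meas}, \vec{E}^{CES}_{meas})
 \times \delta\Big( \sum_{i=1}^{N_{\gamma}} p_{\gamma_i} + \sum_{k=1}^{N_{\pi}} p_{\pi_k} - E_{jet} \Big)\,
 d\vec{p}_{\gamma}\, d\vec{p}_{\pi}
\tag{10}
$$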



While the integral form presented in Eqs. (9) and (10) appears fairly
complicated, it is straightforward to implement in code and to compute
numerically using the Monte Carlo integration technique. The values of
$p_{\pi_k}$ and $p_{\gamma_i}$ that maximize
$\mathcal{L}_p(\vec{p}_{\pi}, \vec{p}_{\gamma})$ in Eq. (9) represent the best
estimate of the energies of the particles produced in the tau decay under the
assumption that the initial hypothesis about the particle content is correct.
Figure 3 shows examples of the $\mathcal{L}_E(E_{jet})$ distributions for two
representative events from a sample of simulated $Z \to \tau\tau$ events.

Figure 3: Examples of $\mathcal{L}_E(E_{jet})$ for two representative
simulated $Z \to \tau\tau$ events plotted versus the combined energy of the
photon candidates in the jet, related to $E_{jet}$ via
$\sum p_{\gamma} = E_{jet} - \sum p_{\pi}$.
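
To make the Monte Carlo evaluation concrete, the sketch below treats the
simplest overlap case: one charged pion of known momentum and one photon of
unknown momentum sharing a single electromagnetic tower. It is a toy under
stated assumptions, not the CDF implementation: the photon is assumed to
deposit all of its energy in the EM tower, both response functions are
invented analytic shapes (a Gaussian for the photon, an exponential EM
deposit times a correlated Gaussian hadronic deposit for the pion), and all
function names and numerical parameters are hypothetical. The delta functions
are used to eliminate the photon EM deposit and the pion hadronic deposit,
leaving a one-dimensional integral over the pion EM deposit.

```python
import math
import random


def gauss_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def photon_cal_pdf(e_em, p_gamma):
    """Toy photon response (assumed): Gaussian around the true momentum."""
    return gauss_pdf(e_em, p_gamma, 0.135 * math.sqrt(p_gamma))


def pion_cal_pdf(e_em, e_had, p_pi):
    """Toy correlated pion response (assumed shape and parameters)."""
    if e_em < 0.0:
        return 0.0
    mean_em = 0.15 * p_pi
    p_em = math.exp(-e_em / mean_em) / mean_em
    return p_em * gauss_pdf(e_had, p_pi - e_em, 0.5 * math.sqrt(p_pi))


def likelihood(p_gamma, p_pi, e_em_meas, e_had_meas, rng, n_samples=20000):
    """Monte Carlo estimate of the overlap integral for a given p_gamma.

    The delta functions fix the photon EM deposit to e_em_meas - e_em_pi
    and the pion hadronic deposit to e_had_meas.
    """
    total = 0.0
    for _ in range(n_samples):
        e_em_pi = rng.uniform(0.0, e_em_meas)
        total += (pion_cal_pdf(e_em_pi, e_had_meas, p_pi)
                  * photon_cal_pdf(e_em_meas - e_em_pi, p_gamma))
    # Mean of the integrand times the length of the sampled interval.
    return e_em_meas * total / n_samples


def scan_photon_momentum(p_pi, e_em_meas, e_had_meas, grid, seed=7):
    """Evaluate the likelihood on a grid and return the maximizing value."""
    rng = random.Random(seed)
    values = [likelihood(p, p_pi, e_em_meas, e_had_meas, rng) for p in grid]
    best = max(range(len(grid)), key=lambda i: values[i])
    return grid[best], values


grid = [0.5 * k for k in range(2, 21)]  # 1.0 ... 10.0 GeV/c
best_p_gamma, values = scan_photon_momentum(10.0, 7.0, 8.0, grid)
```

Because the track momentum is taken as exact, the jet energy in this toy is
simply the pion momentum plus the photon momentum, so the scanned values
plotted against the photon momentum play the role of the curves in Figure 3.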

4.5. The Reduced p-Value Definition

Photon reconstruction failures or the presence of a stable neutral hadron,
e.g. $K^{0}_{L}$, may lead to an incorrect initial particle hypothesis. Such
occurrences result in a suboptimal estimate of the energy, so it is important
to detect and correct such cases. We define a p-value using Eq. (8) but, to
speed up the calculations, we make two simplifications to the definition of
the likelihood $\mathcal{L}_p$ in Eq. (9). First, because in practice most of
the cases affected by an incorrect initial hypothesis can be identified
through inconsistencies between the available calorimeter and tracker
measurements, we drop the terms associated with the CES. Second, we combine
the nine electromagnetic towers in the hadronic tau cluster into a single
"super-tower" with energy $E^{EM}_{meas} = \sum_{j} E^{EM_j}_{meas}$, and
define the "reduced" version of Eq. (9):

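$$
\mathcal{L}'_p(\vec{p} \mid E^{EM}_{meas}, E^{HAD}_{meas})
 = \int \prod_{j=1}^{9} dE^{EM_j}_{meas} \times
   \mathcal{L}_p(\vec{p} \mid \vec{E}^{EM}_{meas}, E^{HAD}_{meas}) \times
   \delta\Big( \sum_{j=1}^{9} E^{EM_j}_{meas} - E^{EM}_{meas} \Big)
$$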

We then define the "reduced" p-value according to Eq. (8) using the reduced
$\mathcal{L}'_p$. This p-value quantifies how frequently a set of particles
with true momenta $\vec{p}$ can produce a set of measurements equally or less
probable than the one observed in data. The p-value is sensitive to
inconsistencies in the available calorimeter measurements and can be used to
detect mistakes in the initial particle content hypothesis. Figure 4(a) shows
the distribution of the reduced p-value for all reconstructed hadronic tau
jets in the sample of simulated $Z \to \tau\tau$ events. The p-value is
plotted as a function of the relative difference between the reconstructed
visible tau jet energy at the maximum of the likelihood function and the true
visible jet energy obtained at the particle generator level. It is evident
that the vast majority of mismeasured jets have a very low reduced p-value.
As will be shown next, most of these mismeasurements are due to an incorrect
initial particle hypothesis.

Figure 4: $Z \to \tau\tau$ events in CDF II detector simulation: 1-prong taus
with no photon candidate reconstructed by CES. Left: p-value versus relative
energy mismeasurement $R(\tau_h)$. Right: $R(\tau_h)$ for events with small
p-value before correction (dashed black line) and after correction for
missing photon (full blue line).
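
The p-value described above can be estimated with pseudo-experiments: draw
many sets of measurements from the response function for the hypothesized
momentum and count how often the drawn measurement is equally or less
probable than the observed one. The sketch below does this for a single
charged pion using an invented analytic response (the same toy shape as an
exponential EM deposit with a correlated Gaussian hadronic deposit). The
response shape, its parameters, and all function names are assumptions for
illustration; the text does not specify how the p-value integral is
evaluated in practice.

```python
import math
import random


def pion_density(e_em, e_had, p_pi):
    """Toy 2D pion response density standing in for a tabulated PDF."""
    if e_em < 0.0:
        return 0.0
    mean_em = 0.15 * p_pi
    sigma_had = 0.5 * math.sqrt(p_pi)
    z = (e_had - (p_pi - e_em)) / sigma_had
    em_part = math.exp(-e_em / mean_em) / mean_em
    had_part = math.exp(-0.5 * z * z) / (sigma_had * math.sqrt(2.0 * math.pi))
    return em_part * had_part


def sample_pion(p_pi, rng):
    """Draw one (E_EM, E_HAD) pair from the same toy response."""
    e_em = rng.expovariate(1.0 / (0.15 * p_pi))
    e_had = rng.gauss(p_pi - e_em, 0.5 * math.sqrt(p_pi))
    return e_em, e_had


def reduced_p_value(e_em_obs, e_had_obs, p_pi, n_pseudo=20000, seed=11):
    """Fraction of pseudo-measurements no more probable than the observed one."""
    rng = random.Random(seed)
    observed = pion_density(e_em_obs, e_had_obs, p_pi)
    n_less = 0
    for _ in range(n_pseudo):
        e_em, e_had = sample_pion(p_pi, rng)
        if pion_density(e_em, e_had, p_pi) <= observed:
            n_less += 1
    return n_less / n_pseudo


# A 10 GeV/c track with calorimeter deposits that match a lone pion ...
p_consistent = reduced_p_value(1.0, 9.0, 10.0)
# ... and the same track with a large unexplained EM deposit.
p_inconsistent = reduced_p_value(8.0, 9.0, 10.0)
```

In this toy the second case mimics a jet with an unreconstructed photon: the
extra electromagnetic energy makes the single-pion hypothesis very
improbable and drives the p-value towards zero.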



4.6. Corrections to the Particle Content Hypothesis

Based on the simulation studies, the majority of mismeasurements owing to an
incorrect initial particle hypothesis fall into two categories. The first
category includes tau jets with one charged pion and typically one $\pi^{0}$,
where none of the photons were reconstructed in the CES. This can happen for
one of the following three reasons: (i) a simple CES reconstruction failure
(either dead channels or a photon that is properly registered in the EM
calorimeter but lands outside the fiducial volume of the CES), (ii) the CES
cluster is vetoed for being too close to the extrapolated track position, or
(iii) the photon(s) fall into the cracks between the calorimeter
$\phi$-wedges. The last case is likely impossible to correct, as the
deposited electromagnetic energy is highly sensitive to small differences in
the electromagnetic shower development. In addition, photons hitting the
cracks may deposit a substantial portion of their energy in the hadron
calorimeter. All three cases lead to a substantial underestimation of the tau
jet energy, as only the momentum of the track would count towards the
measurement. To correct for this effect, we apply the following procedure: if
a tau candidate with a single reconstructed track and no reconstructed
photons has too small a reduced p-value (p < 0.005), we first attempt to
correct it by introducing an additional photon. As no CES measurement is
available for this photon, the term with $\mathcal{P}^{CES}_{\gamma}$ in
Eq. (9) is removed and the likelihood function with the modified particle
hypothesis, $\mathcal{L}_p(p_{\pi}, p_{\gamma})$ (or the corresponding
$\mathcal{L}_E$), is recalculated. The new energy is taken as the updated
energy of the tau jet. Figure 4(b) shows the relative difference between the
reconstructed and the true values of the jet energy for these jets before and
after the correction. While the improvement is evident, the catastrophic
cases where photons hit the cracks between the calorimeter wedges cannot be
fully recovered, contributing to reduced resolution. Another contribution,
which makes the distribution broader, comes from events in the second
category, which are discussed next and can be corrected.

Tau jets with one charged hadron and a stable neutral hadron (kaon), which is
not included in the initial particle content hypothesis, typically have an
excess of energy measured in the hadron calorimeter compared to what one
would expect from a single charged pion. Because the excess energy in the
hadron calorimeter, detected using the p-value, cannot be accounted for by
adding a photon at the previous step, the p-value for these jets remains
small after correcting the initial particle content hypothesis for a photon,
as shown in Fig. 5(a). Therefore, for jets with exactly one reconstructed
track and no reconstructed photons that had a low initial p-value (p < 0.05)
and continued to have a low p-value after the photon correction (the
threshold is p < 0.03), the particle content hypothesis is modified to
contain one charged pion and one neutral kaon. Technically, this is
accomplished by adding a term
$\mathcal{P}^{CAL}_{n}(E^{EM}, E^{HAD} \mid p_{n}) = \mathcal{P}^{CAL}_{\pi}(E^{EM}, E^{HAD} \mid p_{n})$
(as the calorimeter response for charged pions and neutral hadrons is very
similar) in Eq. (9), and adjusting the argument of the delta functions to
include the new particle. The energy of the tau jet candidate is updated with
the energy obtained from maximizing $\mathcal{L}_p(p_{\pi}, p_{n})$ (or the
corresponding $\mathcal{L}_E$). The relative difference between the
reconstructed and the true tau jet energy before and after the correction for
this class of jets is shown in Fig. 5(b).

Figure 5: $Z \to \tau\tau$ events in CDF II detector simulation: 1-prong taus
with no photon candidate reconstructed by CES and
p-value$^{uncor}$ < 0.005. Left: p-value after photon correction versus
$R(\tau_h)$. Right: $R(\tau_h)$ for events with small
p-value$^{\gamma\text{-}cor}$ before kaon correction (dashed blue line) and
after correction for kaons (full red line).
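
The two-step correction of this section amounts to a short decision
sequence, sketched below. The function name, the string labels, and the
calling convention are hypothetical. The default thresholds are the values
quoted in the text for the photon step (0.005) and for the check after the
photon correction (0.03); the text also quotes a looser requirement of 0.05
on the initial p-value for the kaon step, which is automatically satisfied
by any jet that reached the photon step.

```python
def choose_hypothesis(n_tracks, n_photons, p_initial, p_with_photon,
                      photon_step_max=0.005, kaon_step_max=0.03):
    """Pick the particle-content hypothesis for a tau candidate.

    p_initial:     reduced p-value under the initial hypothesis
    p_with_photon: reduced p-value after adding one photon without a
                   CES measurement (only used if the first test fails)
    """
    # The corrections apply only to single-track jets without photons.
    if n_tracks != 1 or n_photons != 0:
        return "initial"
    # Measurements consistent with a lone charged pion: keep as is.
    if p_initial >= photon_step_max:
        return "initial"
    # An extra photon explains the calorimeter measurements.
    if p_with_photon >= kaon_step_max:
        return "pion+photon"
    # Still inconsistent: attribute the excess to a neutral kaon.
    return "pion+kaon"
```

For example, `choose_hypothesis(1, 0, 0.001, 0.4)` selects the pion plus
photon hypothesis, while `choose_hypothesis(1, 0, 0.001, 0.01)` falls through
to the pion plus neutral kaon hypothesis.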

4.7. PPFA Energy Resolution

Figures 6(a) and (b) show the relative difference between the PPFA
reconstructed tau jet transverse momentum and the true visible transverse
momentum obtained at generator level for one- and three-prong hadronic tau
jets. For comparison, the same plots show the performance of the standard CDF
tau reconstruction (see [11] for details), drawn as a dashed line, using the
same simulated $Z \to \tau\tau$ events. It is evident that the PPFA algorithm
converges to the correct energy without resorting to the complex ad-hoc
corrections used in the standard CDF reconstruction. The improvement is
particularly striking in cases with significant energy overlaps, as
illustrated in Fig. 6(c), which shows the same distribution, but for
one-prong events containing at least one photon pointing to the same
calorimeter tower as the track.

Figure 6: Comparison between reconstructed transverse momentum and true
transverse momentum of the hadronic tau for $Z \to \tau\tau$ events in CDF II
detector simulation. The red solid line corresponds to the likelihood method,
the black dashed line corresponds to standard CDF tau reconstruction.
(a): events with 1-prong tau; (b): events with 3-prong tau; (c): events with
significant energy overlap where 1-prong tau is required to have a track and
at least one reconstructed CES cluster in the same calorimeter tower.

To quantify the level of improvement, we use the fraction of jets with the
reconstructed energy falling within 10% of the true jet energy, denoted as
$f_{10\%}$ in Fig. 6. On average, the PPFA increases $f_{10\%}$ by about 10%.
The PPFA jet energy resolution distribution also has a more symmetric shape
around the true energy and a reduced tail due to jets with underestimated
reconstructed energy. The improvement in the tail behavior is more pronounced
for one-prong jets, as one-prong taus more frequently contain neutral pions
with a significant contribution towards the total visible jet energy.
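
The figure of merit used above is simple to compute from paired lists of
reconstructed and true energies. The following sketch shows one way to do
it; the function name and the example numbers are invented for illustration.

```python
def fraction_within(reco, true, tolerance=0.10):
    """Fraction of jets with |E_reco - E_true| <= tolerance * E_true."""
    if len(reco) != len(true) or not reco:
        raise ValueError("need two non-empty lists of equal length")
    n_good = sum(1 for r, t in zip(reco, true) if abs(r - t) <= tolerance * t)
    return n_good / len(reco)


# Four hypothetical jets: two are reconstructed within 10% of the truth.
f10 = fraction_within([10.0, 12.0, 8.5, 20.0], [10.0, 10.0, 10.0, 19.0])
```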

5. PPFA Performance in Data 

While the simulation studies show that the PPFA provides an accurate
measurement in a single, self-consistent framework free of complex ad-hoc
corrections, it is important to validate the algorithm performance in a
realistic analysis setting using actual data. The energy resolution for
hadronic tau jets cannot be evaluated directly using data. Unlike the case of
$Z \to ee$ or $Z \to \mu\mu$ events, where the lepton momentum resolution can
be inferred from the width of the dilepton mass spectrum, there is no such
"standard candle" for taus at hadron colliders. In the case of
$Z \to \tau\tau$, which is the only fairly clean physics signal enriched with
true taus accessible at hadron colliders, the invariant mass distribution of
the visible tau decay products is very broad, as partial cancellation of the
missing transverse energy associated with the momenta of the neutrinos from
tau decays precludes reconstructing the neutrino momenta. In addition to the
improved energy resolution, the PPFA can potentially deliver other
advantages, e.g. a better discrimination against QCD multi-jet backgrounds
due to "sharper" shapes of identification variables and the new PPFA specific
handles, such as the estimate of the jet energy uncertainty on a jet-by-jet
basis and the p-value. However, as many of these potential improvements are
correlated, disentangling and quantifying each of them separately is not
practical. Moreover, a sample of hadronic taus with purity suitable for such
studies would have insufficient statistics due to the very harsh cuts
required to reduce background contamination.

Figure 7: Kinematic distributions demonstrating the purity of the clean tau
sample after $Z \to \tau\tau \to \ell\tau_h\nu\nu\nu$ ($\ell = e$ or $\mu$)
events are extracted from CDF data with tight selection requirements:
(a) transverse momentum of the light lepton, (b) transverse momentum of
visible decay products of the hadronically decaying tau lepton.

Given the above limitations, we validate the PPFA in a realistic data setting
and evaluate its performance as follows. First, we demonstrate that the
PPFA-based tau jet energy measurement in the data is well described by the
simulation. Similarly, we show that the PPFA p-value is well reproduced in
the data. Second, we study the tau jet invariant mass distribution for events
with tau decays dominated by
$\tau^{\pm} \to \rho^{\pm}\nu \to \pi^{\pm}\pi^{0}\nu$ and compare the
PPFA-based measurement with that obtained using the standard CDF
reconstruction. While such an invariant mass is only moderately sensitive to
the jet energy resolution, this test allows an indirect validation of the
PPFA jet energy resolution and a comparison with the standard CDF
reconstruction. Finally, as a qualitative demonstration of the PPFA potential
for enhanced background discrimination, we perform two side-by-side
proto-analyses using similar data selections that rely on discriminators
provided by the PPFA in one case and the standard CDF reconstruction in the
other.

Figure 8: Distribution of hadronic tau candidate p-value for events passing
selection requirements in data (points) compared to the sum of background and
signal predictions. Left: 1-prong taus. Right: 3-prong taus.



5.1. Validation of the PPFA Reconstruction Using $Z \to \tau\tau$ Data

We use a fairly clean and well understood sample of $Z \to \tau\tau$ events
collected by CDF in Run-II in the channel where one tau lepton decays
hadronically ($\tau \to \tau_h\nu_{\tau}$) and the other decays to a light
lepton ($\tau \to \ell\nu_{\tau}\nu_{\ell}$, where $\ell$ stands for an
electron or muon). Selections require a tightly isolated reconstructed muon
or electron with $20 < p_T < 40$ GeV/c and a loose hadronic tau jet
candidate. Tau jet candidates are required to have a seed track with
$p_T > 10$ GeV/c; no explicit requirement on the full momentum of the jet is
applied, to exclude biases owing to the choice of a tau energy reconstruction
algorithm. Several event topology cuts are applied to reduce contamination
due to cosmics, $Z/\gamma^{*} \to ee$, $Z/\gamma^{*} \to \mu\mu$ and
$W$+jets events. A full list of selections is available in [12]. The
remaining QCD multi-jet background is estimated from data using events with
lepton and tau candidates having electric charge of the same sign. We rely on
simulation to estimate the $Z/\gamma^{*} \to \tau\tau$, $Z \to ee$,
$Z \to \mu\mu$ and $W$+jets contributions. These processes are generated
using Pythia Tune A with CTEQ5L parton distribution functions [13] and the
detector response is simulated using the GEANT-3 package [S].

Once the sample is selected, the PPFA reconstruction is performed in data and
simulation. A thorough comparison of kinematic distributions sensitive to the
hadronic tau jet energy measurement leads us to conclude that the PPFA
performance in the data is well described by the simulation. As an
illustration, Figs. 7(a) and (b) show the lepton momentum and the PPFA-based
hadronic tau jet momentum distributions for the selected
$Z/\gamma^{*} \to \tau\tau$ candidate events, to demonstrate the good
agreement between data and simulation as well as to give readers a feel for
the purity of the sample used.

As for the new handles made available by the PPFA, we specifically studied
the reduced p-value, which quantifies the level of consistency of the
contributing calorimetric measurements with the hypothesis maximizing the
PPFA likelihood. Despite its seeming complexity, the distribution of the
reduced p-value is well described by the simulation. Figure 8 shows the
distribution of the PPFA p-value for selected hadronic tau candidates with
one or three charged tracks. Apart from the good agreement between the data
and simulation, it is evident that the reduced p-value provides
discrimination against jets from multi-jet QCD events and can be utilized in
physics analyses to improve the purity of selected data.

5.2. PPFA Energy Resolution 

As discussed earlier, a direct measurement of the energy resolution for
hadronic tau jets using data is not possible, as the presence of multiple
neutrinos in the event precludes reconstruction of the Z boson mass.
Partially reconstructed mass definitions, e.g. the transverse mass of the
lepton, hadronic tau jet and the missing transverse energy, all result in
broad shapes owing to the unreconstructed neutrinos. The width of these
distributions is nearly independent of the tau jet energy resolution,$^{6}$
precluding quantitative estimates of the latter from data.

Although only modestly sensitive to the accuracy of the jet energy
measurement, the reconstructed invariant mass of the constituents of a tau
jet can be used for qualitative comparisons. In particular, a significant
fraction of one-prong tau jets is produced in decays
$\tau^{\pm} \to \nu_{\tau}\rho^{\pm}(770) \to \nu_{\tau}\pi^{\pm}\pi^{0}$. In
these decays, the invariant mass of the hadronic tau jet should be consistent
with the mass of the $\rho$ meson, and the width of the distribution is
sensitive (although somewhat weakly) to the resolution of the hadronic tau
jet energy measurement. Figures 9(a) and (b) show distributions of the
invariant mass of the one-prong tau candidates reconstructed in the data
using the PPFA approach and the standard CDF tau reconstruction, with the
simulation predictions overlaid. Note that the pedestal near
$m = 0.14$ GeV/c$^2$ is due to tau jets with no reconstructed photons, which
include $\pi^{0}$-less one-prong tau decays as well as the cases where the
photon is not reconstructed. While these comparisons do not allow quantifying
the improvement in the jet energy measurement resolution, it is evident that
the PPFA technique provides a better measurement of the tau invariant mass.
Similar improvements can be expected for other measurable quantities related
to particle and energy flow within a tau jet candidate. As tau identification
mainly relies on exploiting differences in particle and energy flow
properties between narrow tau jets and the broader generic jets from the QCD
multi-jet backgrounds, such improvements have the potential to improve the
rejection of multi-jet backgrounds.

6 Even in more advanced approaches designed to improve mass reconstruction
for ditau resonances, e.g. the MMC technique [12], the resolution is still
dominated by the accuracy of the missing transverse energy measurement.

Figure 9: Distribution of the invariant mass for reconstructed hadronic tau
candidates in the clean $Z \to \tau\tau \to \ell\tau_h\nu\nu\nu$ ($\ell = e$
or $\mu$) sample extracted from CDF data using tight selection requirements:
comparison between PPFA reconstruction (a) and standard CDF reconstruction
(b) for all 1-prong tau candidates.

5.3. Tau Identification and Background Discrimination 

While the primary goal of the PPFA is an accurate jet energy measurement in
the high occupancy environment, it also provides additional tools that can be
used in physics analyses to improve discrimination against backgrounds.
Improved accuracy of the measurements of energy and of particle and energy
flow properties, as well as the new PPFA-specific handles, such as the
p-value or the jet-by-jet energy measurement uncertainty, can all aid in
discriminating hadronic tau jets from multi-jet QCD backgrounds. To
illustrate this, we model two simple proto-analyses, both aiming to maximize
the signal to background ratio for a sample of $Z \to \tau\tau$ candidate
events by exploiting properties of the tau jet candidates. One of the
analyses relies on variables calculated using the standard CDF reconstruction
and the other one relies on the PPFA calculations. Both analyses start with a
sample of candidate $Z \to \tau\tau$ events with a level of background
contamination from QCD multi-jet events that is fairly typical for physics
analyses.$^{7}$ Compared to the high purity sample, the "realistic" sample is
obtained by loosening the isolation and some other tight quality requirements
on the lepton leg and removing the requirement on the absence of additional
energetic jets in the event. The purity and composition of this sample can be
inferred from Fig. 10, which shows several kinematic and jet shape variables.

Figure 10: 1- and 3-prong taus in the QCD enriched sample. Data (points)
compared to the sum of background and signal predictions: (a) transverse
momentum of visible decay products, (b) hadronic tau visible invariant mass,
(c) p-value distribution, (d) $\Delta\theta(\tau)$ distribution.

7 The clean $Z \to \tau\tau$ sample used so far features extremely tight
lepton leg selections designed to achieve a high purity source of hadronic
taus. While effective in reducing multi-jet backgrounds, such selections are
not typical of physics analyses due to their very low signal efficiency.

Table 1: Selections used in the two proto-analyses using either PPFA or
standard CDF selection for hadronically decaying tau jets. The first group of
selections corresponds to standard CDF selections applied first in both
analyses. The second group shows additional non-standard selections using the
invariant mass and the narrowness of the tau candidate's jet cluster that can
be applied to both analyses. The last selection uses the PPFA p-value and is
only applied to the PPFA proto-analysis.

  1-prong                                   3-prong
  ----------------------------------------  --------------------------------
  $p_T > 10$ GeV/c                          $p_T > 15$ GeV/c
  $p_T^{seedtrk} >$ GeV/c                   $p_T^{seedtrk} > 10$ GeV/c
  $N^{trk}_{iso\,cone}$ = n                 $N^{trk}_{iso\,cone}$ =

  < $m(\tau)$ < 0.25 or                     0.8 < $m(\tau)$ < 1.4 GeV/c$^2$
  0.375 < $m(\tau)$ < 1.4 GeV/c$^2$
  $\Delta\theta$ < 0.04                     $\Delta\theta$ < 0.015

  Only the PPFA-based analysis:
  p > 0.008 if $p_T < 20$ GeV/c             p > 0.06 if $p_T < 30$ GeV/c

Table 1 describes the selections applied. Momentum thresholds and seed track
$p_T$ requirements are chosen to select a sample with an acceptable level of
background while not being specific to either the PPFA or the standard CDF
reconstruction. $N^{trk}_{iso\,cone}$ is the number of tracks with
$p_T > 1$ GeV/c in the isolation cone.
$\Delta\theta(\tau) = \sum_i E_i\,\theta_i / \sum_i E_i$ is the weighted
angular width of the jet calculated using the momenta of the individual
particles reconstructed in the jet, similar to the case of the previously
discussed jet invariant mass $m(\tau)$. The summation runs over the particles
in the jet, with $E_i$ being the particle energy and $\theta_i$ the angle
between the particle and the visible 4-momentum of the tau jet. The specific
cut choices for $\Delta\theta(\tau)$ and $m(\tau)$ (see Table 1) aim at a
high signal efficiency while rejecting the tails of the corresponding
distributions, which are dominated by background events. These cuts are
therefore expected to reduce background contamination, but are not optimized
in any particular way. The selections discussed above can be applied equally
to both the standard and the PPFA-based analyses. Finally, we apply an
additional $p_T$-dependent cut on the p-value in the PPFA-based analysis
only. The distributions for these variables using PPFA definitions are
illustrated in Figs. 10(b), (c) and (d).
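
The two jet-shape variables defined above can be computed directly from the
reconstructed particle four-momenta. The sketch below assumes each particle
is given as a tuple (E, px, py, pz) in consistent units; the function names
and this data layout are choices made for the example, not part of the
original analysis code.

```python
import math


def visible_four_momentum(particles):
    """Sum (E, px, py, pz) over all reconstructed particles in the jet."""
    return tuple(sum(p[k] for p in particles) for k in range(4))


def invariant_mass(particles):
    """Invariant mass m(tau) of the visible jet constituents."""
    e, px, py, pz = visible_four_momentum(particles)
    m2 = e * e - (px * px + py * py + pz * pz)
    return math.sqrt(max(m2, 0.0))  # guard against negative rounding


def angle_between(a, b):
    """Opening angle between two 3-vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def angular_width(particles):
    """Energy-weighted mean angle between each particle and the jet axis."""
    _, jx, jy, jz = visible_four_momentum(particles)
    e_sum = sum(p[0] for p in particles)
    weighted = sum(p[0] * angle_between(p[1:], (jx, jy, jz)) for p in particles)
    return weighted / e_sum


# Two massless particles of unit energy, each 0.1 rad from the jet axis.
alpha = 0.1
jet = [(1.0, math.sin(alpha), 0.0, math.cos(alpha)),
       (1.0, -math.sin(alpha), 0.0, math.cos(alpha))]
width = angular_width(jet)
mass = invariant_mass(jet)
```

For this symmetric configuration the width equals the half-opening angle and
the mass equals twice the sine of that angle, which gives a quick sanity
check of the implementation.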

To compare the default reconstruction and PPFA side by side, Figs. 11(a) and
(b) show the "after" distributions for $m(\ell, \tau_h, \not\!\!E_T)$, the
visible mass of the lepton, tau and missing transverse energy,$^{8}$ for each
of the two proto-analyses. As a quantitative figure of merit for the
comparison of the two techniques, in Fig. 11(c) we show the ratio
$N^{Z \to \tau\tau}(m > m_0)/N^{QCD}(m > m_0)$, where
$N^{Z \to \tau\tau}(m > m_0)$ and $N^{QCD}(m > m_0)$ are the estimated rates
of $Z \to \tau\tau$ events and background events with
$m(\ell, \tau_h, \not\!\!E_T) > m_0$ for the selected sample, as a function of
$m_0$. Note that the backgrounds are heavily dominated by QCD multi-jet
events. Near its maximum, the S/B ratio is a factor of 1.7 higher for the
PPFA case. While by no means exhaustive, this comparison indicates the
potential of the PPFA technique in discriminating hadronic tau jets from
quark and gluon jets, thus providing a useful byproduct of the method that
can be utilized in physics analyses.

8 This quantity is frequently used as the final discriminant in physics
analyses [HI [S]

Figure 11: The distribution of the visible mass,
$M(\tau, \ell, \not\!\!E_T)$, for events with 1- and 3-prong taus in data
(points) compared to the sum of background and signal ($Z \to \tau\tau$)
predictions after applying selections utilizing variables calculated using
either standard CDF reconstruction (a), or PPFA (b). The PPFA case includes a
cut on the p-value and otherwise the cut values are the same. (c) S/B ratio
as a function of the minimal threshold $M_0$ on $M(\tau, \ell, \not\!\!E_T)$.
The dashed (green) line shows the standard CDF tau reconstruction and the
solid (blue) line corresponds to PPFA; bands indicate the statistical
uncertainty on the ratio due to the size of the sample and fluctuations in
background contributions.
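
The S/B figure of merit as a function of the mass threshold can be sketched
as follows. Events are represented as (mass, weight) pairs so that simulated
signal and data-driven background estimates can carry different
normalizations; this representation, the function name, and the example
numbers are assumptions made for illustration.

```python
def signal_to_background(signal, background, thresholds):
    """Return [(m0, S/B)] with S and B summed over events with mass > m0."""
    result = []
    for m0 in thresholds:
        n_sig = sum(w for m, w in signal if m > m0)
        n_bkg = sum(w for m, w in background if m > m0)
        # The ratio is undefined when no background survives the cut.
        ratio = n_sig / n_bkg if n_bkg > 0.0 else None
        result.append((m0, ratio))
    return result


# Hypothetical unit-weight events with visible masses in GeV/c^2.
signal = [(60.0, 1.0), (70.0, 1.0), (80.0, 1.0), (90.0, 1.0)]
background = [(30.0, 1.0), (40.0, 1.0), (50.0, 1.0), (85.0, 1.0)]
scan = signal_to_background(signal, background, [0.0, 55.0, 95.0])
```

Running the same scan on the outputs of the two proto-analyses and comparing
the maxima is one way to reproduce a comparison of the kind shown in
Fig. 11(c).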
6. Conclusions



The PPFA is a consistent, probabilistic framework designed for accurate 
reconstruction of the jet energy in the high occupancy environment, relevant 
for experiments operating in the very high luminosity regime or featuring 
coarse calorimeter segmentation. The framework is based on "first princi- 
ples" and is essentially free of ad-hoc corrections. The PPFA can be imple- 
mented in a realistic detector setting, as demonstrated using the example of 
hadronic tau reconstruction at CDF. It is shown to provide a more accurate 
jet energy measurement and better discrimination against backgrounds com- 
pared to the existing tools utilizing the particle flow concept. For hadronic 
tau reconstruction, the new tools provided by the PPFA, such as a jet-by-jet 
estimate of the jet energy uncertainty and the p-value quantifying the likeli- 
hood of the current hypothesis about particle content of a jet, can be used to 
further improve energy resolution and provide better discrimination against 
backgrounds. The proposed technique can be utilized at the LHC experi- 
ments once the machine is upgraded for the very high luminosity regime as 
well as at future collider experiments. 